ABSTRACT



Although Missouri has had a Career Ladder program for teachers since 1987, very
little research has been carried out to measure the program’s effects and what has 
been studied has not been comprehensive. This paper examines the program's 
effect on student achievement across the state, using longitudinal data on district math and 
reading scores for 524 Missouri school districts over a nine-year period. Our primary 
specification compares achievement levels in participating districts with a matched group of 
non-participating districts. We also applied alternative specifications to identify the impact 
of the program, for example controlling for prior district scores and measuring variations in 
district participation over time to identify effects of the program within a given district. 
Across the range of specifications, the estimated effects of the Career Ladder program range 
from small positive effects to no effect in both math and reading. We conclude that if the 
Career Ladder has a positive impact on test scores, it is probably very small. 




I. Background 



A. Policy Problem and Research Question 

Public school teachers are usually paid according to two objective criteria alone: their 
years of experience and their educational attainment (certificates, degrees, or coursework). 
This system, known as the uniform salary schedule, has received criticism for being unfair, 
because it does not reward effort or skill, and for being inefficient, because it does not 
encourage hard work or attract talent (Hanushek 1981). 

Education policymakers seeking to reform the way teachers are paid have tried many 
times, often without success, to tie teacher compensation more closely to the quantity and 
quality of teachers’ work. An influential 1983 report by the National Commission on 
Excellence in Education, entitled A Nation at Risk, shined a spotlight on the problem and 
spurred a wave of reforms during the mid- to late 1980s. Many of the reforms included 
career ladders for teachers, which allowed teachers to advance in salary based on factors 
other than seniority, such as demonstrated skills or performance. However, most of the
reforms enacted in the late 1980s did not last very long (Glazerman 2004). This study 
focuses on an important exception, a teacher career ladder program that the State of 
Missouri started in 1986 and that continues to operate more or less unchanged. 

One goal of Missouri’s Career Ladder program is to improve student achievement by 
offering teachers opportunities to earn extra pay for extra work and professional 
development, where eligibility for these opportunities is based on a combination of seniority 
and subjective performance evaluation. The policymakers who established the program 
hoped that the incentives created by the availability of such opportunities as well as the 
activities themselves would improve academic services, programs, and learning outcomes for
students, in part by attracting and retaining effective teachers. 

This paper is part of a larger study that focuses on Missouri’s Career Ladder Program as 
a whole to find out how it really works and whether it is achieving the goals mentioned
above. The study posed the following broad research questions: 

1. How does the program operate in theory and in practice?

2. What effect does the Career Ladder have on student achievement?

3. What effect does the Career Ladder have on teachers’ career decisions, 
specifically their decisions to stay in their school district or to remain in the 
teaching field? 

This paper addresses the second research question above, while two companion reports 
(see Silman et al. forthcoming and Booker and Glazerman forthcoming) address the other 
two questions. 

To date, policymakers have very little evidence on which to base answers to these three 
questions. The only published evidence on the effectiveness or even the operation of the 
Missouri Career Ladder program that we could find was limited to two reports on early 
program implementation (Schofer et al. 1987; Taylor and Madsen 1989), two single-district 
studies also from the early years of the program (Ebmeier and Hart 1992; Henson and Hall 
1993) and a brief set of tabulations by the Missouri Department of Elementary and 
Secondary Education (DESE) using 1999 test score results for a subset of the state’s 
districts. Our analysis of statewide test scores covers 10 years of data, substantially updates 
the DESE analysis, and explicitly accounts for observable differences between Career 
Ladder and non-Career Ladder districts. 

Across the country, policymakers have little rigorous evidence on the effectiveness of 
teacher incentive programs in general. Reviews by Glazerman (2004), Goldhaber and 
Anthony (2007), and Podgursky and Springer (2007) indicate that attempts to study teacher 
incentive programs rigorously are frequently thwarted by the early termination of the very 
programs being studied. Arizona appears to be the only state besides Missouri to have a 
career ladder program that has lasted since the 1980s. Dowling et al. (2007) have studied the 
effects of Arizona’s Career Ladder Program on student achievement. Their study design 
compared student performance in participating districts with performance in a matched set 
of comparison districts over a two-year period. They found positive impacts on test scores 
in math, reading, and writing even over the short period they examined. It is worth noting, 
however, that the Arizona and Missouri programs differ in at least one major respect: In 
Arizona the career ladder plans allow student achievement to be considered in determining 
teacher pay, whereas in Missouri the plans do not. 

B. Overview of the Missouri Career Ladder Program 

As background, we describe the program as it operates, based on available program 
documents and published literature, and to a lesser extent how it operates in practice. 
Silman et al. (forthcoming) present more in-depth findings on program operations based on 
first-hand data collected for this study. 

1. Program Structure and Operation 

Through the Career Ladder program, teachers who meet statewide and district-level 
performance criteria are eligible to receive supplementary pay for meeting Career Ladder
responsibilities, which can be extra teaching work or participation in professional
development. The program does not replace the regular salary schedule. Career Ladder 
responsibilities must be academic in nature and directly related to the improvement of 
programs and services for students. 

A teacher moves up the Career Ladder in three stages. To move up the ladder, teachers 
are assessed at each stage through periodic observations and evaluations of documentation. 
Each successive stage offers the opportunity to receive more supplementary pay for Career 
Ladder responsibilities: up to $1,500 for Stage I, $3,000 for Stage II, and $5,000 for Stage III. 
Out of more than 65,000 teachers in 524 districts statewide, more than 17,000 teachers (26 
percent) from 333 districts (64 percent) participated in the Career Ladder program during 
the 2005-06 school year. 

The Missouri Career Ladder has the distinction of being the most mature teacher 
compensation reform program in the country. It came into being in 1985 and has outlasted 
dozens of programs that were introduced around the country at the same time. Missouri’s 
program is unusual in the way it mixes teacher performance, tenure, and extra 
responsibilities to define salary supplements. To advance up the Career Ladder teachers must 
meet certain tenure requirements and show progress in their performance as rated by 
classroom observers, yet the bonuses are actually given for the extra responsibilities they 
carry out. The Career Ladder advancement accounts for only the amount of extra 
responsibility and the rate at which the extra work is compensated. 

2. District Participation 

Missouri’s program is available statewide but districts must choose whether they will 
participate and, if so, they must provide matching funds. Districts that choose to participate 
must submit a District Career Ladder Plan (DCLP) to the Missouri Department of 
Elementary and Secondary Education (DESE). DESE approves plans that meet state 
guidelines for improving academic services and programs for students. DCLPs must be 
aligned with a statewide Missouri School Improvement Program. They also must include 
curriculum development plans, professional development plans for teachers, guidelines for 
teachers’ Career Development Plans, and an instrument for Performance-Based Teacher
Evaluation. 

While all participating districts must contribute matching funds for the program, poorer 
districts receive a higher percentage of matching funds from the state. Every year the state 
ranks districts according to their per-capita income, and based upon this ranking the state 
covers 40 percent of Career Ladder program costs for districts in the top quartile, 50 percent 
of costs for districts in the next quartile, and 60 percent of costs for districts in the bottom 
half. Some districts may decide not to participate because they are unable to afford their
share of the program costs despite this graduated matching rate. 
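The graduated matching schedule described above can be sketched as a small lookup. This is only an illustration: the function names, the use of an income percentile as the input, and the handling of districts that fall exactly on a quartile boundary are assumptions, not details taken from program documents.

```python
def state_match_share(income_percentile: float) -> float:
    """Share of Career Ladder costs covered by the state.

    income_percentile is the district's rank by per-capita income,
    from 0 (lowest income) to 100 (highest income). The treatment of
    boundary values is an assumption made for this sketch.
    """
    if not 0 <= income_percentile <= 100:
        raise ValueError("income_percentile must be between 0 and 100")
    if income_percentile >= 75:   # top income quartile
        return 0.40
    if income_percentile >= 50:   # next quartile
        return 0.50
    return 0.60                   # bottom half


def district_cost(total_cost: float, income_percentile: float) -> float:
    """District's own share of program costs after the state match."""
    return total_cost * (1 - state_match_share(income_percentile))
```

Under these assumptions, a district in the bottom half of the income ranking with $100,000 in program costs would be responsible for about $40,000, while a district in the top quartile would owe about $60,000.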

3. Teacher Eligibility and Qualifications for a Bonus

To enroll in the Career Ladder and qualify for bonuses, each teacher must develop her 
or his own Career Development Plan (associating each Career Ladder responsibility with 
either a designated plan or some other instructional improvement). The district Career 
Ladder Review Committee, which is made up of educators (selected by teachers) and 
administrators, must then approve the teacher’s development plan. Through scheduled and 
unscheduled observations, as well as reviews of their Career Development Plan and other 
documentation such as lesson plans, the teacher must show evidence of performance at or 
above the expected level on 20 criteria listed in the district’s Performance-Based Teacher 
Evaluation (PBTE) instrument. The criteria span these six areas: (1) engaging students in 
class, (2) correctly assessing students, (3) exhibiting content knowledge, (4) showing 
professionalism in the school, (5) participating in professional development, and (6) adhering 
to the district’s education mission. There are also specific qualification criteria for each stage 
of the Career Ladder, as follows: 

• Stage I. To qualify for Stage I, a teacher must have five years of teaching 
experience in the state and have performed at the “expected” level or above on 
all criteria on the most recent final evaluation instrument of the PBTE. 

• Stage II. To qualify for Stage II, a teacher must have completed two years of 
service at Stage I of the Career Ladder. The district may waive one year of 
service at the previous stage if the teacher has spent seven years teaching in 
Missouri. The teacher also must have performed at the “expected” level or 
above on all criteria, and above the expected level on at least 10 percent of the 
criteria on the most recent final evaluation instrument of the PBTE. 

• Stage III. To qualify for Stage III, a teacher must have completed three years 
of service at Stage II of the Career Ladder. The district may waive two years of 
service at the previous stage if the teacher has spent a total of 10 years teaching 
in Missouri’s public schools. The teacher also must have performed at the 
“expected” level or above on all criteria, and above the expected level on at least 
15 percent of the criteria on the most recent final evaluation instrument of the 
PBTE. 
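The stage-specific criteria listed above can be summarized in a short sketch. The function name and argument names are hypothetical, and the waiver is modeled as a simple flag the district may set; actual district procedures may involve additional judgment that this sketch does not capture.

```python
def qualifies_for_stage(stage, years_teaching_in_mo, years_at_prior_stage,
                        share_above_expected, all_at_expected=True,
                        district_grants_waiver=False):
    """Check the stage-specific criteria summarized in the list above.

    share_above_expected is the fraction (0 to 1) of PBTE criteria rated
    above the expected level. district_grants_waiver is a hypothetical
    flag for the district's optional waiver of prior-stage service.
    """
    if not all_at_expected:
        return False
    if stage == 1:
        return years_teaching_in_mo >= 5
    if stage == 2:
        required_years, waivable, waiver_experience, min_share = 2, 1, 7, 0.10
    elif stage == 3:
        required_years, waivable, waiver_experience, min_share = 3, 2, 10, 0.15
    else:
        raise ValueError("stage must be 1, 2, or 3")
    # The district may waive part of the prior-stage service requirement
    # for teachers with enough total experience in Missouri.
    if district_grants_waiver and years_teaching_in_mo >= waiver_experience:
        required_years -= waivable
    return (years_at_prior_stage >= required_years
            and share_above_expected >= min_share)
```

For example, a teacher with eight years in Missouri and one year at Stage I would qualify for Stage II only if the district grants the waiver and at least 10 percent of the criteria are rated above the expected level.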

To receive a salary supplement, teachers must spend a specified amount of time on a 
certain number of responsibilities outside of their contracted time. Examples of the extra 
responsibilities that Career Ladder teachers undertake include doing extra work, such as 
providing students with opportunities for enhanced learning experiences, remedial 
assistance, and various extended day/year activities, and participating in professional
development, such as taking college classes, attending workshops, and participating in
professional organizations. 1 

The district’s Career Ladder Review Committee evaluates the teachers to determine if 
they have carried out their responsibilities and should receive supplementary pay. Almost all 
Career Ladder teachers do receive this supplementary pay. The minimum time teachers must 
spend on these responsibilities in a given year is determined by their stage on the Career 
Ladder, as follows: 

• Stage I teachers must spend a total of at least 60 hours on at least two 
responsibilities 

• Stage II teachers must spend a total of at least 90 hours on at least three 
responsibilities 

• Stage III teachers must spend a total of at least 120 hours on at least four 
responsibilities. 

In the 2005-06 school year, Stage I teachers spent an average of 79 hours on these
responsibilities, Stage II teachers 111 hours, and Stage III teachers 144 hours.
These hours approximately translate to supplementary pay at $19, $27, and $35 per hour, 
respectively, for Stages I, II, and III, somewhat lower than the nominal hourly rates that 
would be earned by doing the minimum requirement: $25, $33, and $42 per hour. The 
bonus amounts have never been increased or adjusted for inflation since the program was 
established in 1985. 
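The hourly figures in the preceding paragraph follow from dividing the maximum supplement at each stage by the hours worked. The short sketch below reproduces that arithmetic; it assumes that teachers receive the full supplement for their stage, and the variable names are illustrative.

```python
MAX_BONUS = {"I": 1500, "II": 3000, "III": 5000}       # dollars per year
MIN_HOURS = {"I": 60, "II": 90, "III": 120}            # required minimum hours
AVG_HOURS_2005_06 = {"I": 79, "II": 111, "III": 144}   # reported average hours


def hourly_rate(stage, hours):
    """Supplement per hour, assuming the maximum supplement is paid."""
    return MAX_BONUS[stage] / hours


for stage in ("I", "II", "III"):
    nominal = hourly_rate(stage, MIN_HOURS[stage])
    effective = hourly_rate(stage, AVG_HOURS_2005_06[stage])
    print(f"Stage {stage}: nominal ${nominal:.0f}/hour, "
          f"effective ${effective:.0f}/hour")
```

Rounded to the nearest dollar, the nominal rates are $25, $33, and $42 per hour and the effective rates are $19, $27, and $35 per hour, matching the figures in the text.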

District participation in the Career Ladder program has grown steadily since it started in 
1986-87, although it grew the most rapidly in the program’s early years. Table 1 shows the 
history of the Career Ladder program. The table’s first column shows that the number of 
districts participating rose dramatically in the program’s first six years, from 63 districts in 
the first year to 204 districts by 1992-93. After 1995-96 growth slowed, with the total of 333 
districts participating in 2005-06 reflecting an increase of 47 districts over a ten-year period. 
Similar patterns hold for growth over time in the number of teachers participating and in the 
total state payments made for the program. 

In this study we sought to identify the Career Ladder program’s impacts by examining 
changes between districts before and after they begin implementing the program. One of 
the challenges we faced is that while the program has been in operation since the 1986-87 
school year, Missouri did not start state-wide standardized testing in math and reading until 
1997-98. By that time, 288 districts were already participating in the program, and only 18 
districts stopped participating after 1997-98, so there is little opportunity to compare student 
achievement in those districts before and after participation. During the recent nine-year 



1 DESE recommends that teachers should not spend more than one-third of Career Ladder hours on
college classes and workshops.

period for which standardized test score data is available, only 66 districts switched
participation status. Thus, the majority of variation in participation status occurs cross- 
sectionally between districts. 



Table 1. History of Career Ladder Program

             Number of Districts    Number of Teachers    Total State
Year         Participating          Participating         Payment

1986-87       63                     2,400                $2,624,025
1987-88      121                     5,074                $7,182,975
1988-89      147                     5,811                $10,484,500
1989-90      177                     6,803                $13,839,075
1990-91      192                     7,580                $16,688,675
1991-92      199                     8,322                $18,902,575
1992-93      204                     8,536                $20,362,750
1993-94      229                    10,696                $24,426,950
1994-95      269                    13,021                $29,300,325
1995-96      286                    14,107                $33,358,250
1996-97      278                    13,741                $34,312,899
1997-98      288                    14,098                $35,799,849
1998-99      299                    14,707                $37,333,522
1999-00      309                    15,827                $37,687,074
2000-01      322                    16,688                $37,993,100
2001-02      330                    17,101                $38,253,625
2002-03      338                    17,412                $38,599,500
2003-04      332                    16,982                $37,103,360
2004-05      328                    16,919                $36,465,400
2005-06      333                    17,378                $36,986,803

Source: Table contains data from the Missouri Career Ladder program 2005-06 annual report,
produced by the Missouri Department of Elementary and Secondary Education.




II. Data 



DESE provided us with district-level data on average math and reading scores,
Career Ladder participation, and a broad range of demographic and other variables. 
The test score data cover nine years and nearly all of the 524 districts in the state. 
The Missouri Office of Social and Economic Analysis provided us with additional district- 
level census data. 

Comparing mean district characteristics, we find that districts participating in the Career 
Ladder program are on average smaller, more white, more economically disadvantaged, and 
more rural than districts that do not participate. Table 2 compares the mean characteristics 
of participating and non-participating districts, at the beginning and the end of the analysis 
period. 2 Districts that were participating had much lower average enrollments than districts
that were not: In 1997-98, average enrollments were 1,108 students (participating districts) 
and 2,411 students (non-participating), and in 2004-05 enrollments were 1,164 (participating) 
and 2,513 (non-participating). The median participating district (585 students) was also 
smaller than the median non-participating district (762 students) in 2004-05. Participating 
districts were also more likely to be rural (75 percent) than non-participating districts (62 
percent) in 1997-98. 

Districts that participate in the Career Ladder program are more predominantly white 
than non-participating districts, with an average of 97 percent white in participating districts 
in 1997-98, compared to 91 percent white in non-participating districts. The size of the gap 
is relatively constant over the analysis period, as the African-American and Hispanic 
percentages rise for both groups by 2004-05. The percent of students who are economically 
disadvantaged, as measured by free or reduced-price lunch eligibility, is greater for 
participating districts, with a difference of 44 percent (participating) to 38 percent 
(nonparticipating) in 1997-98 and 50 percent to 45 percent, respectively, in 2004-05. This 
mirrors the characteristics of the districts’ overall populations. Districts that participate have 



2 Nine districts participated in the Career Ladder program for at least one year during the analysis period 
on a limited basis, in order to reward National Board Certified teachers. In our analysis we do not include 
those districts in the sample of participating districts. 




a higher percentage than other districts of households that are considered poor, a higher 
percentage with no college education, and a lower median income. 



Table 2. Characteristics of Participating and Non-participating Districts

                                               1997-98                          2004-05
                                       Participating  Non-participating  Participating  Non-participating
Average Characteristics                Districts      Districts          Districts      Districts

Enrollment                             1,108          2,411              1,164          2,513
Percent white                          96.7%          91.1%              94.9%          88.4%
Percent African-American               2.0%           7.6%               2.6%           9.1%
Percent Hispanic                       0.7%           0.7%               1.6%           1.6%
Percent economically disadvantaged     43.5%          38.4%              49.5%          44.6%
Teacher annual salary                  $27,939        $28,511            $33,569        $34,889
Teacher experience level               12.1           12.4               12.2           12.5
Student-teacher ratio                  13.1           13.2               12.2           12.5
Percent in a large or mid-size city    8.0%           19.9%              7.5%           22.8%
Percent in a large or small town       17.5%          18.2%              16.3%          20.3%
Percent in a rural area                74.5%          61.9%              76.2%          56.9%
Percent urban                          15.5%          31.8%              14.8%          35.6%
Percent with no college                65.4%          61.5%              65.3%          61.0%
Percent poor                           14.4%          12.4%              14.3%          12.2%
Median household income                $31,945        $36,039            $32,040        $36,579
Propensity score                       0.665          0.363              0.641          0.351
Number of districts                    286            236                320            202



Districts that were participating in the Career Ladder program during the analysis period 
were also more likely to have started participating prior to the analysis period. Table 2 
includes an average propensity score for participating and non-participating districts, which 
is the predicted probability of the district participating in 1994-95, based on the district’s 
1994-95 characteristics. The primary determinants of the propensity score are the state 



3 The propensity score is generated from a district-level logit regression with 1994-95 district data. The 
dependent variable is an indicator for district Career Ladder participation in 1994-95, and the explanatory 
variables are percent African-American, percent Hispanic, percent economically disadvantaged, log of 
enrollment, percent urban, percent with no college, percent poor, log of median household income, and the 
district’s state Career Ladder match rate.

matching rate, 4 with districts that have higher matching rates being more likely to participate,
and the percent urban and percent with no college education, with urban and highly- 
educated districts being less likely to participate. In both 1997-98 and 2004-05, districts that 
were participating in the program were approximately 30 percentage points more likely to have been
doing so in 1994-95 than non-participating districts, based on their 1994-95 characteristics. 
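As a rough illustration of how a propensity score is produced from a fitted logit model, the sketch below applies the logistic link to a linear index of district characteristics. The coefficient values, the intercept, and the reduced set of variables are hypothetical placeholders chosen only to match the signs described above; they are not estimates from this study.

```python
import math


def propensity_score(intercept, coefficients, characteristics):
    """Predicted participation probability from a fitted logit model."""
    index = intercept + sum(coefficients[name] * characteristics[name]
                            for name in coefficients)
    return 1.0 / (1.0 + math.exp(-index))


# Hypothetical coefficients for illustration only (not estimates from the paper).
example_coefficients = {
    "state_match_rate": 2.0,   # higher match rate: more likely to participate
    "pct_urban": -1.5,         # more urban: less likely to participate
    "pct_no_college": 1.0,     # less educated population: more likely
}
high_match = {"state_match_rate": 0.60, "pct_urban": 0.15, "pct_no_college": 0.65}
low_match = {"state_match_rate": 0.40, "pct_urban": 0.15, "pct_no_college": 0.65}

p_high = propensity_score(-1.0, example_coefficients, high_match)
p_low = propensity_score(-1.0, example_coefficients, low_match)
```

With these made-up values, the district with the higher state matching rate receives the higher predicted probability, which is the direction of the relationship reported in the text.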

Missouri implemented statewide standardized testing in 1996-97 with the introduction 
of the MAP math test in grades 4, 8, and 10. In 1997-98 Missouri added a MAP reading test 
in grades 3, 7, and 11, and continued to test math and reading in these grades through the 
2004-05 school year. We use the district average scale scores for math and reading through 
this time period, with the analysis covering 1996-97 through 2004-05 for math and 1997-98 
through 2004-05 for reading. The MAP scale scores vary from 450 to 910 across grade 
levels and subjects. 

In both 1997-98 and 2004-05, the average math and reading test scores for districts that 
were participating in the Career Ladder program were quite similar to those for districts that 
were not. Table 3 compares average test scores across participating and non-participating 
districts in 1997-98 and 2004-05, separately by grade and subject. Non-participating districts 
had higher average test scores in most grades and subjects in both years, although the 
differences are quite small relative to the standard deviations and the gap in average test 
scores is never statistically significant at the 10 percent level. 



4 Prior to 1996 the state matching rate on Career Ladder expenditures was determined by district property 
value per pupil, and ranged from 90 percent for the lowest property value districts to 35 percent for the highest 
property value districts.

Table 3. District Average Achievement Levels

                                   1997-98                          2004-05
                           Participating  Non-participating  Participating  Non-participating
Subject       Grade        Districts      Districts          Districts      Districts

Math          4            636.6          637.2              644.3          644.9
                           (12.0)         (12.4)             (11.6)         (13.3)
Math          8            695.5          696.2              704.9          706.2
                           (15.1)         (14.7)             (11.7)         (15.3)
Math          10           718.2          720.2              737.3          736.9
                           (14.0)         (15.5)             (13.4)         (15.3)
Reading       3            635.1          633.9              641.9          641.2
                           (10.4)         (10.9)             (9.7)          (11.3)
Reading       7            668.2          670.1              676.9          677.4
                           (11.3)         (12.9)             (9.4)          (10.9)
Reading       11           704.2          707.1              712.1          713.4
                           (13.1)         (10.9)             (8.1)          (8.1)
Number of Districts        286            236                320            202

Standard deviations in parentheses




III. Methods 



We can think about modeling achievement impacts with a student-level cumulative
achievement function, in which current student achievement depends on school 
and family inputs. 5 Ideally such a model would use student-level longitudinal data 
spanning the entire analysis period, in order to compare the achievement trajectories of 
individual students as they enter or exit treatment status. In our analysis, treatment status 
varies at the school district level because treatment is exposure to a regime in which teachers 
are offered the possibility of a bonus if they qualify and then perform certain activities. 

Therefore, we model district average student achievement in each time period as a 
function of the district’s Career Ladder participation status, as well as other observable 
characteristics of the district. The estimation equation is: 

(1)  $A_{igt} = CL_{i,t-1} + X_{it} + PS_i + \theta_{gt} + \delta_i + \varepsilon_{it}$

where $A_{igt}$ is the average achievement level in district i for grade g in year t, $CL_{i,t-1}$ is
Career Ladder participation status for district i in year t-1, $X_{it}$ is a set of district
demographic control variables, $PS_i$ is the district’s 1994-95 propensity score (explained below),
$\theta_{gt}$ is a set of grade by year indicators, $\delta_i$ are district random effects, and
$\varepsilon_{it}$ is the random error term. Because
there are unobserved factors that vary between participating and non-participating districts
but not necessarily over time, we include district random effects in the error structure to
improve the precision of the coefficient estimates by accounting for the district-specific
component of variance.
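Because the treatment variable in the estimation equation is participation in the prior year, the panel must be arranged so that each district-year observation carries the previous year's status. A minimal sketch of that step follows; the dictionary layout, the integer year encoding, and the choice to drop observations with no prior-year record are assumptions made for illustration and do not describe the authors' actual data preparation.

```python
def lagged_participation(participation):
    """Map (district, year) -> prior-year participation indicator.

    participation maps (district, year) -> 0 or 1, where year is an
    integer such as the spring year of the school year (an assumed
    encoding). Observations with no prior-year record are dropped.
    """
    lagged = {}
    for district, year in participation:
        prior = participation.get((district, year - 1))
        if prior is not None:
            lagged[(district, year)] = prior
    return lagged


# Toy example: district "A" joins in 1998; district "B" has only one year of data.
toy = {("A", 1997): 0, ("A", 1998): 1, ("A", 1999): 1, ("B", 1998): 1}
toy_lagged = lagged_participation(toy)
```

In the toy example, district A's 1998 observation is coded as untreated because the district did not participate in 1997, and district B's single observation is dropped because no prior-year status is available.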

We did not include lagged test scores as a control variable on the right hand side of the 
estimating equation. A reason for this is that prior to 2004, Missouri did not require districts 
to test their students in consecutive grades. This intermittent testing design makes it difficult 
to explicitly model achievement growth in the way that most student-level achievement 
models require. Over the analysis period students would have at most two test scores in a 



5 See Todd and Wolpin (2005) for more details on cumulative achievement functions.

subject, spaced two to four years apart, making a comparison of achievement trajectories
before and after treatment mostly uninformative. 

Districts can choose each year whether or not to participate in the Career Ladder 
program, so naive comparisons of participating with non-participating districts will 
confound selection effects with program impacts. That is, districts’ career ladder 
participation status, $CL_{it}$, may be correlated with unobserved determinants of student
achievement, represented in our model by $\delta_i$ and $\varepsilon_{it}$, and therefore endogenous. If
participation is endogenous, the estimated achievement effect would be biased. The 
estimates could be downward biased, for example, if districts only participated out of fear of 
falling test scores, or could be upward biased if districts participated because they were 
reform-minded in general. Our qualitative research (see Silman et al. forthcoming) suggests 
that teachers themselves were the main force behind district participation decisions and that 
the program was simply viewed as a way to augment salaries. 

One way to address the problem of endogenous participation is to use propensity score 
methods that attempt to balance the distribution of observable characteristics among Career 
Ladder and non-Career Ladder districts. The propensity score is the probability that a 
district would have started participating before 1995-96, which we estimate using a logistic 
regression. We include this propensity score as a control variable in the model and also use 
it to form subgroups of observably similar comparison and treatment schools. 

After adjusting for the propensity score, the mean characteristics of participating and 
non-participating districts in 1997-98 are no longer significantly different. The first two 
columns of Table 4 show the unadjusted mean characteristics of participating and non- 
participating districts in 1997-98, and the last two columns show the mean characteristics 
after regressing each characteristic variable on the district propensity score. Participating 
districts are on average significantly different from non-participating districts in most 
respects, but when the district characteristics are adjusted for the propensity score, none of 
the differences is statistically significant, and most of the difference between the mean 
characteristics is eliminated. For instance, participating districts have a mean percent African 
American of 2.0 percent and non-participating districts a mean of 7.6 percent, a difference of 
5.6 percent, but after adjusting for the propensity score this difference shrinks to 0.6 percent. 

Because there are many ways to specify such a model, we identified a benchmark model 
that we believe is most plausible, and later use sensitivity tests to examine the robustness of 
the benchmark results according to different modeling or specification assumptions. The 
benchmark estimation specification models the average district math and reading test scores 
as a function of district demographics, community characteristics, and the propensity score 
variable. The model includes district random effects to account for the district-specific 
component of variance, as well as grade-by-year fixed effects. The treatment variable is an 
indicator for the district participating in the Career Ladder program in the prior year. We 
use lagged Career Ladder participation because the effect of a district’s participation in the 
program is likely to be strongest on student achievement in the following year. We omitted 
67 comparison districts with very low propensity scores from the model, since their 
characteristics are quite different from those of any of the participating districts.

Table 4. Propensity Score Adjusted Characteristics of Participating and Non-participating
Districts, 1997-98

                                           Unadjusted Means             Propensity Score Adjusted Means
                                       Participating  Non-participating  Participating  Non-participating
Average Characteristics                Districts      Districts          Districts      Districts

Enrollment                             1,108*         2,411*             1,494          1,943
Percent white                          96.7%*         91.1%*             94.4%          93.8%
Percent African-American               2.0%*          7.6%*              4.2%           4.8%
Percent Hispanic                       0.7%           0.7%               0.7%           0.7%
Percent economically disadvantaged     43.5%*         38.4%*             40.9%          41.1%
Percent in a large or mid-size city    8.0%*          19.9%*             13.4%          13.5%
Percent in a large or small town       17.5%          18.2%              19.4%          15.8%
Percent in a rural area                74.5%*         61.9%*             67.2%          70.7%
Percent urban                          15.5%*         31.8%*             21.7%          24.4%
Percent with no college                65.4%*         61.5%*             63.9%          63.4%
Percent poor                           14.4%*         12.4%*             13.5%          13.5%
Median household income                $31,945*       $36,039*           $33,578        $34,061
Number of districts                    286            236                286            236

* indicates difference is significant at 5% level




IV. Results 



A. Main Findings 

Our best estimates of the average effect of the Career Ladder program on 
achievement across the three tested grade levels are significantly positive but small for math 
scores and not significantly different from zero for reading scores. The estimates, presented 
in Table 5, are reported in “effect size” units, which represent fractions of a standard 
deviation in the district-level distribution of scores.⁶ The effect sizes of 0.066 
for math and 0.043 for reading suggest that a district’s participation in Career Ladder is 
associated with an increase in scores of 6.6 percent and 4.3 percent of a standard deviation, 
respectively, in the distribution of mean test scores across districts. For 
comparison, the coefficient on the district’s percentage of students who are economically 
disadvantaged is -0.828, so a ten point decrease in percent disadvantaged would be 
associated with an increase in average test scores of 0.083 of a standard deviation. Put 
differently, the effect of Career Ladder participation on math scores is comparable to a 
7 percentage point reduction in the district’s percentage of economically disadvantaged 
students, holding all else equal. In terms of student-level effect sizes, the effect size of 0.066 
is equivalent to approximately 0.02 standard deviations in the student-level distribution of 
scores. (Test scores vary considerably more across students than do average scores across 
districts.) 
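The comparisons in this paragraph can be retraced with a few lines of arithmetic. The sketch below uses only the rounded figures quoted in the text; the variable names are illustrative, and it assumes the disadvantaged share enters the regression on a 0-1 scale. With these rounded inputs the equivalent reduction comes out a little under 8 percentage points, close to the figure quoted above; the small gap presumably reflects rounding in the published coefficients.

```python
# Back-of-envelope check using the rounded coefficients quoted in the text.
# The unrounded estimates are not reported, so results are approximate.
cl_math_effect = 0.066   # district-level effect size for math (Table 5)
disadv_coef = -0.828     # coefficient on share economically disadvantaged
                         # (assumed to be measured on a 0-1 scale)

# Predicted gain from a ten-point (0.10) decrease in the disadvantaged share.
ten_point_gain = disadv_coef * -0.10

# Percentage-point reduction in disadvantage with the same predicted gain
# as Career Ladder participation.
equivalent_points = cl_math_effect / abs(disadv_coef) * 100

# Ratio of district-level to student-level standard deviations implied by
# the statement that 0.066 (district) is about 0.02 (student).
student_level_effect = 0.02
implied_sd_ratio = student_level_effect / cl_math_effect

print(f"Gain from ten-point decrease: {ten_point_gain:.3f} SD")
print(f"Equivalent reduction: {equivalent_points:.1f} percentage points")
print(f"Implied district/student SD ratio: {implied_sd_ratio:.2f}")
```

The last figure says that, under these numbers, district averages are roughly one third as dispersed as individual student scores.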

Although the estimated Career Ladder effect on math scores is statistically significant, 
caution is needed in making causal inferences. Because districts choose whether to 
participate in the program, there may be other confounding differences between Career 
Ladder and non-Career Ladder districts that are impossible to fully control for, which could 
bias the estimated achievement effects. Additionally, a difference of less than one tenth of a 
standard deviation in test scores, even when it is statistically significant, is quite a small 
effect. 



6 To derive effect size estimates, we standardized all scores so that the distribution of average test scores 
has a mean of zero and a standard deviation of one within each grade level, subject, and year. 
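
The standardization described in footnote 6 can be sketched as follows. This is a minimal illustration rather than the authors' code: the record layout and the scores are made up, and the use of the population standard deviation (rather than the sample version) is an assumption, since the paper does not say which was used.

```python
from collections import defaultdict
from statistics import mean, pstdev

def standardize(records):
    """Add a z-score to each record, computed across districts within
    each (grade, subject, year) cell. Assumes every cell has at least
    two districts with differing scores."""
    cells = defaultdict(list)
    for r in records:
        cells[(r["grade"], r["subject"], r["year"])].append(r["score"])
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in cells.items()}
    out = []
    for r in records:
        m, s = stats[(r["grade"], r["subject"], r["year"])]
        out.append({**r, "z": (r["score"] - m) / s})
    return out

# Hypothetical district average scores (not from the paper).
records = [
    {"district": "A", "grade": 4, "subject": "math", "year": 1999, "score": 210.0},
    {"district": "B", "grade": 4, "subject": "math", "year": 1999, "score": 220.0},
    {"district": "C", "grade": 4, "subject": "math", "year": 1999, "score": 230.0},
    {"district": "A", "grade": 3, "subject": "reading", "year": 1999, "score": 200.0},
    {"district": "B", "grade": 3, "subject": "reading", "year": 1999, "score": 204.0},
]

for row in standardize(records):
    print(row["district"], row["subject"], round(row["z"], 3))
```

Within each cell the resulting z-scores have mean zero and standard deviation one, so a coefficient on the participation indicator can be read directly in effect size units.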
Table 5. Benchmark Specification

                                              Math        Reading

Overall CL Effect                             .066**      .043
CL Effect (enrollment < 1600)                 .060*       .042
CL Effect (enrollment 1600-5000)              .075        .023
CL Effect (enrollment > 5000)                 .203        .137
Number of districts                           454         454
Number of district-grade-year observations    10,425      10,013

* indicates significance at 10%, ** at 5%, *** at 1% 

When we looked for a differential Career Ladder effect by district size, we found that 
the estimated achievement effect is largest for the large districts, but the estimates for the 
two larger size categories are not statistically significant. Table 5 presents the results from 
interacting the district Career Ladder participation indicator with indicators for district size, 
where districts are divided into three size categories based on their K-12 student enrollment 
(<1600, 1600-5000, or 5000+). If there is a fixed cost to a district for participating in the 
Career Ladder program, there could be more benefit for a large district to participate, since 
it has more teachers to benefit from the program, although we learned from qualitative 
research (see Silman et al. forthcoming) that the program has very little fixed cost. The 
funding rules starting in 1996-97 required a lower state matching rate for large districts, so 
their participation required a greater district contribution per teacher than did the 
participation of smaller districts. Nevertheless, the vast majority of both participating and 
non-participating districts fall in the small-district category, and the small-district math 
effect is positive and statistically significant at the ten percent level. 
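
The size interactions described above amount to splitting the participation indicator into three mutually exclusive regressors. The sketch below shows one way to build them; the function and variable names are hypothetical, and the treatment of a district with exactly 5,000 students is an assumption, since the text does not say which category it falls in.

```python
def size_category(enrollment):
    """Assign a district to one of the three K-12 enrollment categories.
    Placing exactly 5,000 in the middle category is an assumption."""
    if enrollment < 1600:
        return "small"
    if enrollment <= 5000:
        return "medium"
    return "large"

def interaction_terms(participates, enrollment):
    """Return the Career Ladder indicator interacted with each size
    category: at most one of the three terms equals 1."""
    category = size_category(enrollment)
    return {
        f"cl_x_{c}": int(bool(participates) and category == c)
        for c in ("small", "medium", "large")
    }

# A participating district with 1,108 students and a non-participating
# district with 6,000 students (illustrative enrollments).
print(interaction_terms(True, 1108))
print(interaction_terms(False, 6000))
```

Each coefficient on these terms is then the estimated participation effect for districts in that size category, which is how the three size rows of Table 5 can be read.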

The results differ when we disaggregate by grade level (Table 6). For both math and 
reading the point estimate of the Career Ladder participation effect is largest for the 
elementary grade (grade 4 in math, grade 3 in reading), with an effect of 0.124 in math and 
0.100 in reading, each approximately twice as large as the effect for all grades 
combined, and statistically significant at the five percent level. For math, the 8th grade effect 
is also significant, with an effect of 0.082. For both subjects the elementary grade effect is 
statistically significantly different from the high school grade effect. The results of 
interacting the Career Ladder indicator with district size show a similar pattern: the 
largest effects occur for the elementary grades and for districts with high enrollments. 

We can only speculate on the reasons for a differential effect by grade level. The 
pattern of larger effects for elementary grades could be due to a greater return in test scores 
from extra instruction given in those grades. Or perhaps it is easier to affect student test 
scores at a younger age, generally and apart from extra instruction, so that any effect is more 
pronounced in elementary school. 
Table 6. Benchmark Specification, Separately by Grade

                                      Math                             Reading
                                      Grade 4    Grade 8    Grade 10   Grade 3    Grade 7    Grade 11

Overall CL Effect                     .124***    .082**     .021       .100**     .040       -.008
CL Effect (enrollment < 1600)         .116***    .084**     .001       .103**     .036       -.021
CL Effect (enrollment 1600-5000)      .115       .061       .099       .056       .031       .032
CL Effect (enrollment > 5000)         .359**     .105       .150       .184       .165       .083
Number of districts                   454        454        400        454        454        400
Number of district-year observations  3,617      3,615      3,193      3,477      3,475      3,061

* indicates significance at 10%, ** at 5%, *** at 1% 

B. Robustness Checks 

The findings are robust to the method by which we used propensity scores to 
construct a matched comparison. The benchmark estimates (Table 5) used the propensity 
score as a covariate. As an alternative, we used the subgroup classification method,⁷ whereby 
we estimate treatment and comparison group means within specified intervals of the 
propensity score distribution and average the differences across intervals. We repeated this 
method using four and then ten equal intervals (quartiles and deciles). The results (shown in 
Table 7) lead to the same conclusion as the covariate adjustment method used to produce 
the benchmark estimates, with an average effect size of 0.074 in math and 0.064 in reading 
when quartiles are used. 
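
The final averaging step of the subgroup classification method can be illustrated with the interval-level estimates reported in Table 7. The sketch below assumes the intervals are weighted equally, which seems natural for equal-sized quartiles and deciles but is not stated explicitly in the text; it takes the within-interval differences as given rather than estimating them.

```python
from statistics import mean

# Within-interval effect estimates copied from Table 7.
quartile_effects = {
    "math":    [0.066, 0.057, 0.105, 0.070],
    "reading": [0.023, 0.072, -0.007, 0.166],
}
decile_effects = {
    "math":    [0.049, 0.123, 0.024, 0.027, 0.015,
                0.006, 0.148, 0.032, 0.096, 0.187],
    "reading": [0.015, 0.080, 0.003, 0.099, 0.050,
                -0.097, 0.008, 0.104, 0.115, 0.409],
}

def average_effect(interval_effects):
    """Equal-weight average of treatment-comparison differences across
    propensity score intervals (the weighting is an assumption)."""
    return mean(interval_effects)

for label, table in (("quartiles", quartile_effects), ("deciles", decile_effects)):
    for subject, effects in table.items():
        print(f"{label:9s} {subject:8s} {average_effect(effects):.4f}")
```

Under this assumption the simple averages agree with the "Average Effect" rows of Table 7 to within rounding of the published interval estimates.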

The benchmark specification includes controls for observable district characteristics, 
but no control for prior test scores because Missouri did not conduct routine annual testing 
in consecutive grades. However, we were able to construct a synthetic pretest by going back 
two or more years for a given cohort, to the grade level where the students had been 
previously tested. This alternative specification uses the prior average test score for each 
cohort as a control variable. For example, for 10th grade math observations the cohort’s 
prior average test score would be the district average 8th grade math score from two years 
earlier. As one might expect, the effect of the Career Ladder disappears when we use the 
pretest specification, a result that is almost entirely attributable to the composition effect, i.e. 
dropping the lowest grade, which we know from Table 6 is largely responsible for the 
positive Career Ladder effect. Table 8 presents the results including controls for cohort 
prior average test scores. In order to show how the reduced sample changes the estimated 
effects, the last two columns report results for the benchmark specification, but restricted to 
the sample of district-year observations for which there is a cohort prior test score available. 
Restricting the sample in this way, we find that the effect on reading scores is basically zero, 
and the effect on math is less positive and no longer statistically significant. It is important to 
note, though, that most participating districts were also participating when the pretest was 
administered, so the pretest could also have been affected by district Career Ladder 
participation. Including the pretest as a control would therefore bias the results towards zero. 



7 For a more detailed discussion of this method see Rosenbaum and Rubin (1984). 
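
The construction of the synthetic pretest can be sketched as a lookup of the same cohort's score at the previous tested grade. The tested grades below are those listed in Table 6; everything else (function names, the use of an integer school year, and the assumption that a cohort advances exactly one grade per year) is illustrative rather than taken from the authors' code.

```python
# Tested grades by subject, as listed in Table 6.
TESTED_GRADES = {"math": [4, 8, 10], "reading": [3, 7, 11]}

def prior_test(subject, grade, year):
    """Return (prior_grade, prior_year) for the cohort tested in `grade`
    during `year`, or None for the lowest tested grade, which has no
    earlier test. Assumes one grade of progress per year."""
    grades = TESTED_GRADES[subject]
    position = grades.index(grade)
    if position == 0:
        return None
    prior_grade = grades[position - 1]
    return prior_grade, year - (grade - prior_grade)

def cohort_pretest(scores, district, subject, grade, year):
    """Look up the district's average score for the same cohort at its
    previous tested grade; `scores` maps
    (district, subject, grade, year) to a district average score."""
    key = prior_test(subject, grade, year)
    if key is None:
        return None
    prior_grade, prior_year = key
    return scores.get((district, subject, prior_grade, prior_year))

# 10th grade math in 2002 is matched to 8th grade math two years earlier.
print(prior_test("math", 10, 2002))
# The lowest tested grade has no pretest and drops out of the sample.
print(prior_test("reading", 3, 2002))
```

Because the lowest tested grade in each subject has no earlier score, observations for grades 3 and 4 fall out of the pretest sample, which is the composition effect discussed above.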



Table 7. Benchmark Specification, Separately by Propensity Score Groups

                    Math        Reading

Lowest Quartile     .066        .023
2nd Quartile        .057        .072
3rd Quartile        .105*       -.007
Top Quartile        .070        .166*
Average Effect      .074*       .064

Lowest Decile       .049        .015
2nd Decile          .123        .080
3rd Decile          .024        .003
4th Decile          .027        .099
5th Decile          .015        .050
6th Decile          .006        -.097
7th Decile          .148        .008
8th Decile          .032        .104
9th Decile          .096        .115
10th Decile         .187        .409**
Average Effect      .070*       .079*

* indicates significance at 10%, ** at 5%, *** at 1% 





Table 8. Controlling for Prior Cohort Average Test Scores

                                            Prior average test  Test score          Benchmark
                                            score as control    differences as      specification,
                                            variable            dependent           restricted to same
                                                                variable            sample
                                            Math      Read      Math      Read      Math      Read

Overall CL Effect                           .044      -.003     -.013     .000      .054      -.006
CL Effect (enrollment < 1600)               .036      -.020     -.021     -.007     .048      -.027
CL Effect (enrollment 1600-5000)            .058      .025      .011      -.012     .061      .033
CL Effect (enrollment > 5000)               .147      .241      .067      .206      .167      .250
Number of districts                         454       454       454       454       454       454
Number of district-grade-year observations  4,727     3,127     4,727     3,127     4,727     3,127

* indicates significance at 10%, ** at 5%, *** at 1% 




